Speech and Gaze Control for Desktop Environments
Authors
Abstract
This chapter describes a multimodal system that integrates speech- and gaze-based inputs for interaction with a real desktop environment. In this system, multimodal interaction aims at overcoming the intrinsic limits of each input channel taken alone. The chapter introduces the main eye-tracking and speech-recognition technologies, and describes a multimodal system that integrates the two input channels by generating a real-time vocal grammar from gaze-driven contextual information. The proposed approach shows how the combined use of auditory and visual cues achieves mutual disambiguation in the interaction with a real desktop environment. As a result, the system enables the use of low-cost audio-visual devices for everyday tasks, even when traditional input devices, such as a keyboard or a mouse, are unsuitable for use with a personal computer.
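To illustrate the kind of gaze-driven grammar generation the abstract refers to, here is a minimal Python sketch, assuming a hypothetical list of desktop widgets and a JSGF-style grammar as output; the widget names, the fixation radius, and the helper functions (widgets_near_gaze, build_grammar) are illustrative assumptions, not the authors' implementation.

from dataclasses import dataclass

@dataclass
class Widget:
    name: str            # spoken label, e.g. "recycle bin"
    x: float             # screen position of the widget centre (pixels)
    y: float
    actions: tuple       # verbs that make sense for this widget

def widgets_near_gaze(widgets, gaze_x, gaze_y, radius=150.0):
    """Keep only the widgets whose centre lies within `radius` pixels of the gaze point."""
    return [w for w in widgets
            if (w.x - gaze_x) ** 2 + (w.y - gaze_y) ** 2 <= radius ** 2]

def build_grammar(candidates):
    """Emit a small JSGF-style grammar listing only the commands that apply
    to the objects the user is currently looking at."""
    rules = [f"( {' | '.join(w.actions)} ) {w.name}" for w in candidates]
    body = " | ".join(rules) if rules else "<NULL>"
    return f"#JSGF V1.0;\ngrammar desktop;\npublic <command> = {body};"

if __name__ == "__main__":
    desktop = [
        Widget("recycle bin", 900, 620, ("open", "empty")),
        Widget("text document", 880, 560, ("open", "delete", "rename")),
        Widget("web browser", 120, 80, ("open", "close")),
    ]
    # A fixation reported by the eye tracker near the bottom-right corner:
    print(build_grammar(widgets_near_gaze(desktop, 890, 600)))

Restricting the recognizer to such a context-dependent grammar is one plausible way the visual channel can disambiguate acoustically similar commands.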
Similar resources
Multimodal Dialogue for Ambient Intelligence and Smart Environments
Ambient Intelligence (AmI) and Smart Environments (SmE) are based on three foundations: ubiquitous computing, ubiquitous communication and intelligent adaptive interfaces [41]. This type of system consists of a series of interconnected computing and sensing devices which surround the user pervasively in his environment and are invisible to him, providing a service that is dynamically adapted t...
Affordances of Input Modalities for Visual Data Exploration in Immersive Environments
There has been a consistent push towards exploring novel input, display, and feedback technologies for sensemaking from data. However, most visual analytical systems in the wild that go beyond a traditional desktop utilize commercial large displays with direct touch, since they require the least effort to adapt from the desktop/mouse setting. There is a plethora of device technologies that are ...
Selective use of gaze information to improve ASR performance in noisy environments by cache-based class language model adaptation
Using information from a person’s gaze has potential to improve ASR performance in acoustically noisy environments. However, previous work has resulted in relatively minor improvements. A cache-based language model adaptation framework is presented where the cache contains a sequence of gaze events, classes represent visual context and task, and the relative importance of gaze events is conside...
The selective use of gaze in automatic speech recognition
The performance of automatic speech recognition (ASR) degrades significantly in natural environments compared to laboratory assessments. Being a major source of interference, acoustic noise affects speech intelligibility during the ASR process. There are two main problems caused by the acoustic noise. The first is the speech signal contamination. The second is the speakers’ vocal and non-voc...
How do people explore virtual environments?
Understanding how humans explore virtual environments is crucial for many applications, such as developing compression algorithms or designing effective cinematic virtual reality (VR) content, as well as to develop predictive computational models. We have recorded 780 head and gaze trajectories from 86 users exploring omnidirectional stereo panoramas using VR head-mounted displays. By analyzing...
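The cache-based, gaze-driven language model adaptation described in the third related entry above can be sketched as follows. This is a conceptual Python toy, assuming a simple unigram background model, a uniform within-class word distribution, and an exponential decay of gaze events; the class labels, half-life, and interpolation weight lam are invented for illustration and are not the framework evaluated in that work.

from collections import defaultdict

class GazeCacheLM:
    """Toy cache-based class language model: recent gaze events on objects of a
    given visual class raise the probability of words belonging to that class."""

    def __init__(self, base_unigram, word_to_class, lam=0.3, half_life=5.0):
        self.base = base_unigram          # background P(word)
        self.word_to_class = word_to_class
        self.lam = lam                    # interpolation weight of the gaze cache
        self.half_life = half_life        # seconds for a gaze event to lose half its weight
        self.cache = []                   # list of (timestamp, visual_class) gaze events

    def observe_gaze(self, timestamp, visual_class):
        self.cache.append((timestamp, visual_class))

    def _class_weights(self, now):
        # Exponentially decay older gaze events so that recent context dominates.
        weights = defaultdict(float)
        for t, cls in self.cache:
            weights[cls] += 0.5 ** ((now - t) / self.half_life)
        total = sum(weights.values())
        return {c: w / total for c, w in weights.items()} if total else {}

    def _p_word_given_class(self, word, cls):
        # Uniform distribution over the words mapped to the class.
        members = [w for w, c in self.word_to_class.items() if c == cls]
        return 1.0 / len(members) if word in members else 0.0

    def prob(self, word, now):
        # Interpolate the background unigram with the cache-derived class model.
        cls = self.word_to_class.get(word)
        cache_p = self._class_weights(now).get(cls, 0.0) * self._p_word_given_class(word, cls)
        return (1.0 - self.lam) * self.base.get(word, 1e-6) + self.lam * cache_p

if __name__ == "__main__":
    lm = GazeCacheLM(
        base_unigram={"open": 0.02, "document": 0.03, "printer": 0.001},
        word_to_class={"document": "DOCUMENT", "printer": "DEVICE"},
    )
    lm.observe_gaze(timestamp=10.0, visual_class="DEVICE")  # user fixated the printer icon
    # After the fixation, "printer" outscores the otherwise more frequent "document".
    print(lm.prob("printer", now=11.0) > lm.prob("document", now=11.0))  # True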
Journal:
Volume, issue:
Pages: -
Publication year: 2016